Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixed a deadlock when printing surrogate pairs #4150

Merged
3 commits merged into from
Jan 15, 2020
Merged

Fixed a deadlock when printing surrogate pairs #4150

3 commits merged into from
Jan 15, 2020

Conversation

lhecker
Copy link
Member

@lhecker lhecker commented Jan 8, 2020

Summary of the Pull Request

See my code comment below for technical details of the issue that caused #4145.

PR Checklist

Detailed Description of the Pull Request / Additional comments

TBH I kinda hope this project could migrate to an internal use of UTF-8 in the future. 😶

Validation Steps Performed

Followed the "Steps to reproduce" in #4145 and ensured the "Expected behavior" happens.

@DHowett-MSFT
Copy link
Contributor

This has been a historically dangerous area to touch (already!) -- we have had a number of fixes and reversions in this function already.

@DHowett-MSFT
Copy link
Contributor

(I think #1360 is the same issue, actually)

@lhecker
Copy link
Member Author

lhecker commented Jan 8, 2020

You're right - You guys even found the exact same line. 😄

But I believe the conclusion in #1360 was slightly off / incomplete:
Sometimes (?) when you pass a surrogate pair consisting of [high,low] to TextBuffer::Write, the reported distance will be 1 and not 2. ⚠️

Given a surrogate pair in stringView the current code will detect a high surrogate in wch, extract 2 code units from stringView and give that to TextBuffer::Write. Now if the reported distance is 1, the loop counter i will not be incremented.
In the next iteration wch will then be the low (⚠️) surrogate of the code point which is fed into TextBuffer::Write again, whose reported distance is now 0, because the surrogate code unit is incomplete. Afterwards the loop will deadlock.

=> @DHowett-MSFT The actual fix AFAICS is to not use end.GetCellDistance(it) as a measurement whether i needs to be incremented (AFAICS it will only return 1 or 2 anyways).
Instead it should always be incremented by 1 if a surrogate pair is encountered.

@lhecker
Copy link
Member Author

lhecker commented Jan 8, 2020

Huh... @DHowett-MSFT @miniksa turns out that the cell distance will always be 1 for surrogate pairs?
When the OutputCellIterator is incremented the cell distance (_distance) is incremented by 1, whereas the input position (_pos) is incremented by 2, thus consuming the entire surrogate pair.
This leads me to believe that Terminal::_WriteBuffer probably deadlocks or corrupts most surrogate pairs right now, without this PR.

Here's the before/after with this PR with the string "𐐌𐐜𐐬" (3 surrogate pairs):
image

As you can see the current Terminal::_WriteBuffer method currently sadly breaks when faced with surrogates. 😕 This PR mostly fixes it though by not using the cell distance anymore. 🙃

@zadjii-msft
Copy link
Member

zadjii-msft commented Jan 8, 2020

Okay so this has historically been challenging - could we maybe add the tests from #2924 and a test for the #3052 case to this PR? I'd really hate to have another "simple" change in this area secretly bite us again

(I'm inclined to trust that this works better, but I'm more inclined to trust tests 😝)

@zadjii-msft zadjii-msft added Area-Output Related to output processing (inserting text into buffer, retrieving buffer text, etc.) Area-TerminalControl Issues pertaining to the terminal control (input, selection, keybindings, mouse interaction, etc.) Product-Terminal The new Windows Terminal. labels Jan 8, 2020
@miniksa
Copy link
Member

miniksa commented Jan 8, 2020

If we can put some tests in, we can give it another go. Does it also fix #1360, @lhecker?

@lhecker
Copy link
Member Author

lhecker commented Jan 8, 2020

Does it also fix #1360, @lhecker?

@miniksa Using the minimal reproduction program? Yup. 🙂
image

Copy link
Member

@miniksa miniksa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it also fix #1360, @lhecker?

@miniksa Michael Niksa FTE Using the minimal reproduction program? Yup. 🙂
image

Beautiful.

@lhecker
Copy link
Member Author

lhecker commented Jan 8, 2020

could we maybe add the tests from #2924

@zadjii-msft I've tried to adapt the referenced test case, but was unable to figure out how make it work in the first place. If I call Terminal::PrintString() with a string containing 300 "a" characters it never returns. snippet
I hope I can figure out tomorrow how to do whatever the runut CLI does, in Visual Studio so I can debug this better. Otherwise I might have to rely on someone helping me out a bit to adapt #2924's test.

@zadjii-msft
Copy link
Member

@lhecker ruh-roh
image

I just copy-pasted the test and built it locally, and it passed just fine. So maybe your change broke it 😅

Usually, when I'm debugging tests, I add a DebugBreak(); statement and let the post-mortem debugger of WinDbg break into the test process. I don't know if VS has a post-mortem debugger that would work similarly. @miniksa's the other primarily-VS user on the team, so he might have an idea.

@lhecker
Copy link
Member Author

lhecker commented Jan 8, 2020

So maybe your change broke it 😅

I tried it on the master branch - even after bcz dbg. 🤔 I must be doing something wrong.

Usually, when I'm debugging tests, I add a DebugBreak(); statement and let the post-mortem debugger of WinDbg break into the test process.

Thank you so much! I'll try fixing it using that tomorrow. 🤗

@miniksa
Copy link
Member

miniksa commented Jan 9, 2020

Usually, when I'm debugging tests, I add a DebugBreak(); statement and let the post-mortem debugger of WinDbg break into the test process. I don't know if VS has a post-mortem debugger that would work similarly. @miniksa Michael Niksa FTE's the other primarily-VS user on the team, so he might have an idea.

Go to Tools > Options and then make sure that Native is checked as the Just-In-Time Debugging provider. (Checking the box, if it is not checked, will require that Visual Studio is launched as Administrator.)

image

Then when you run something with DebugBreak() in it, you will see this:
image

The top ones will be new instances of the Visual Studios installed on your system. The bottom ones will be the running instances of Visual Studio. You can see I have one open already. If I choose the bottom one, VS will attach straight up as if I F5'd from the solution at the point from the DebugBreak(). Step up once to get out of the break and back into the code.

@DHowett-MSFT
Copy link
Contributor

Here's my trick!

image

image

Setting the command to te.exe with the arguments /inproc $(TargetPath) lets you run it under debug with F5. This doesn't work for every test, as some can't run inproc (TE usually spawns a child process to run tests, but /inproc makes it run them directly).

You can place and hit breakpoints and everything this way.

@miniksa
Copy link
Member

miniksa commented Jan 9, 2020

Here's my trick!

This is probably another good one for the #4163 treatment.

@lhecker
Copy link
Member Author

lhecker commented Jan 10, 2020

I added a test. Can you guys re-review? 🙂


The long explanation for the actual cause of the freeze:

The strings that get fed into Terminal::_WriteBuffer are always chopped into segments of as many characters as the terminal is wide (i.e. 120 chars for a 120 char wide terminal).
But since the string view contains code units and not characters, it'll contain more items than the terminal buffer is wide when surrogate pairs are involved (damn you UTF-16!).
Due to the bug of GetCellDistance() being used to increment i it'll now loop more than e.g. 120 times (namely twice for every surrogate pair).
This causes the buffer line to be full prematurely and GetCellDistance() to return 0.
--> The loop loops infinitely now.
--> Fixed by using GetInputDistance() and by adding a safety check ensuring newlines are inserted if a buffer line is already full. 👍


@zadjii-msft Thanks for making me write the test. I hardened the code a bit more now!
It also helped me understand the underlying issue and #1360 a bit better (as written above).

Regarding DebugBreak When I add a `DebugBreak()` statement as the first thing in a `TEST() { ... }` group I get the following error in my terminal: > A crash with exception code 0x80000003 occurred in module "KERNELBASE.dll" in the process hosting the test code while invoking a test operation

...and no window opens. ¯\_(ツ)_/¯
(Because I don't have much free time I just solved it temporarily by adding a Sleep(10000); instead and quickly attaching a debugger manually. #lifehacks 😄)

@lhecker
Copy link
Member Author

lhecker commented Jan 10, 2020

Hmm... Something strange is still happening. 🤔
If I print a file containing the character "𐐌" ~80 times the following happens in both the classical terminal, as well as the new one:
image

That doesn't happen with emojis. I guess that's another followup ticket?

@miniksa
Copy link
Member

miniksa commented Jan 10, 2020

Hmm... Something strange is still happening. 🤔
If I print a file containing the character "𐐌" ~80 times the following happens in both the classical terminal, as well as the new one:
image

That doesn't happen with emojis. I guess that's another followup ticket?

Yeah, I'd file it as a follow up. Though I'd be surprised if there isn't a dupe somewhere on the tracker. If it's happening in both, it's almost assuredly a conhost.exe bug.

@lhecker
Copy link
Member Author

lhecker commented Jan 14, 2020

I'm just writing to make sure this PR isn't forgotten. 😄
It only needs one more possible approval to be merged and fixes a "Priority-1" issue. 🙂

Copy link
Member

@zadjii-msft zadjii-msft left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for adding the test case!

@zadjii-msft zadjii-msft added the AutoMerge Marked for automatic merge by the bot when requirements are met label Jan 15, 2020
@ghost
Copy link

ghost commented Jan 15, 2020

Hello @zadjii-msft!

Because this pull request has the AutoMerge label, I will be glad to assist with helping to merge this pull request once all check-in policies pass.

p.s. you can customize the way I help with merging this pull request, such as holding this pull request until a specific person approves. Simply @mention me (@msftbot) and give me an instruction to get started! Learn more here.

@ghost ghost merged commit 3e6b4b5 into microsoft:master Jan 15, 2020
@lhecker lhecker deleted the fix-surrogate-output branch January 15, 2020 23:31
DHowett-MSFT pushed a commit that referenced this pull request Jan 24, 2020
## Summary of the Pull Request

See [my code comment](#4150 (comment)) below for technical details of the issue that caused #4145.

## PR Checklist
* [x] Closes #1360, Closes #4145.
* [x] CLA signed. If not, go over [here](https://cla.opensource.microsoft.com/microsoft/Terminal) and sign the CLA
* [ ] Tests added/passed
* [ ] Requires documentation to be updated
* [ ] I've discussed this with core contributors already. If not checked, I'm ready to accept this work might be rejected in favor of a different grand plan. Issue number where discussion took place: #xxx

## Detailed Description of the Pull Request / Additional comments

TBH I kinda hope this project could migrate to an internal use of UTF-8 in the future. 😶

## Validation Steps Performed

Followed the "Steps to reproduce" in #4145 and ensured the "Expected behavior" happens.

(cherry picked from commit 3e6b4b5)
@ghost
Copy link

ghost commented Jan 27, 2020

🎉Windows Terminal Preview v0.8.10261.0 has been released which incorporates this pull request.:tada:

Handy links:

@ghost
Copy link

ghost commented Feb 13, 2020

🎉Windows Terminal Preview v0.9.433.0 has been released which incorporates this pull request.:tada:

Handy links:

donno2048 added a commit to donno2048/terminal that referenced this pull request Sep 28, 2020
## Summary of the Pull Request

See [my code comment](microsoft/terminal#4150 (comment)) below for technical details of the issue that caused #4145.

## PR Checklist
* [x] Closes #1360, Closes #4145.
* [x] CLA signed. If not, go over [here](https://cla.opensource.microsoft.com/microsoft/Terminal) and sign the CLA
* [ ] Tests added/passed
* [ ] Requires documentation to be updated
* [ ] I've discussed this with core contributors already. If not checked, I'm ready to accept this work might be rejected in favor of a different grand plan. Issue number where discussion took place: #xxx

## Detailed Description of the Pull Request / Additional comments

TBH I kinda hope this project could migrate to an internal use of UTF-8 in the future. 😶

## Validation Steps Performed

Followed the "Steps to reproduce" in #4145 and ensured the "Expected behavior" happens.
This pull request was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area-Output Related to output processing (inserting text into buffer, retrieving buffer text, etc.) Area-TerminalControl Issues pertaining to the terminal control (input, selection, keybindings, mouse interaction, etc.) AutoMerge Marked for automatic merge by the bot when requirements are met Product-Terminal The new Windows Terminal.
Projects
None yet
4 participants